[SPARK-38214][SS] No need to filter windows when windowDuration is multiple of slideDuration #35526
nyingping wants to merge 25 commits into apache:master from nyingping:SPARK-38214
Improve window calculation in Structured Streaming
I'll check it later.
Is this a follow-up of #35362? Looks like a different one. But seems okay. Will re-check it later.
Yeah, I meant an additional optimization along with the previous one. Sorry if I confused you.
Can one of the admins verify this patch?
Sorry, it's my fault. I mixed the update history of the previous branch with the present one, which caused interference and misunderstanding.
HeartSaVioR
left a comment
+1 pending tests. Thanks for the contribution!
I'll leave this open for a day to give others a chance to review. I'll merge this tomorrow if there's no new feedback.
OK, no feedback during working hours in the US time zone. Thanks! Merging to master.
Thanks @nyingping for the contribution! I merged into master.
@HeartSaVioR Thank you very much for the review!
What changes were proposed in this pull request?
At present, sliding windows are computed as expand + filter, but in some cases the filter is not necessary.
Filtering is required only if the sliding window is irregular, that is, when the window length is not a multiple of the slide length. When the window length divided by the slide length is an integer (I believe this is also the case for most sliding-window workloads in practice), there is no need to filter, which saves computation resources and improves performance.
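To illustrate why the filter becomes redundant, here is a minimal plain-Python sketch of the expand + filter idea. It is a simplified model, not Spark's actual implementation: the function names `candidate_windows` and `containing_windows`, the integer timestamps, and the `start_time` parameter are assumptions made for illustration only.

```python
def candidate_windows(ts, window, slide, start_time=0):
    """Expand step (simplified model): emit ceil(window / slide) candidate
    windows for one timestamp, newest window first."""
    n = -(-window // slide)  # integer ceiling of window / slide
    # Start of the most recent window whose start is <= ts.
    last_start = ts - (ts - start_time + slide) % slide
    return [(last_start - i * slide, last_start - i * slide + window)
            for i in range(n)]


def containing_windows(ts, window, slide, start_time=0):
    """Filter step: keep only candidates [start, end) that contain ts."""
    return [(s, e) for (s, e) in candidate_windows(ts, window, slide, start_time)
            if s <= ts < e]


# window is a multiple of slide: every candidate already contains ts.
print(candidate_windows(7, 10, 5))   # [(5, 15), (0, 10)]
print(containing_windows(7, 10, 5))  # [(5, 15), (0, 10)]

# window is not a multiple of slide: the oldest candidate can miss ts.
print(candidate_windows(2, 10, 3))   # [(0, 10), (-3, 7), (-6, 4), (-9, 1)]
print(containing_windows(2, 10, 3))  # [(0, 10), (-3, 7), (-6, 4)]
```

In this model, when `window` equals `n * slide`, the oldest candidate ends at `last_start + slide`, which is always greater than `ts`, so the filter never drops a row. When `window` is smaller than `n * slide`, the oldest candidate can end at or before `ts`, so the filter is still needed.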
Why are the changes needed?
To save computation resources and improve performance.
Does this PR introduce any user-facing change?
No.
How was this patch tested?
Unit tests and a benchmark.
A simple benchmark is in this commit, thanks to HeartSaVioR: HeartSaVioR@d532b6f
-------case 1
Result:
-------case 2
Result: